
174 research outputs found

Automatic database acquisition software for ISDN PC cards and analogue boards

Author: Moreno Bilbao M. Asunción
Rodríguez Fonollosa José Adrián
Publication date: 01/01/1998

This paper describes an application for automatic speech database acquisition (ADA) developed by the authors in the framework of the EC Telematics Project SpeechDat II. The software works with standard inexpensive PC cards for ISDN lines, as well as Dialogic boards for analogue telephone lines. Both program versions share a common file format and configuration. Other important characteristics of the recording software are its simple set-up, fast and flexible configuration of the recording session, real-time monitoring of calls and disk space, and proven robustness. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Synthesis using speaker adaptation from speech recognition DB

Author: Bonafonte Cávez Antonio
Moreno Bilbao M. Asunción
Oller Moreno Sergio
Publication venue: Universidad de Vigo
Publication date: 01/01/2010

This paper deals with the creation of multiple voices from a Hidden Markov Model based speech synthesis system (HTS). More than 150 Catalan synthetic voices were built using Hidden Markov Models (HMM) and speaker adaptation techniques. Training data for building a Speaker-Independent (SI) model were selected from both a general-purpose speech synthesis database (FestCat) and a database designed for training Automatic Speech Recognition (ASR) systems (the Catalan SpeeCon database). The SpeeCon database was also used to adapt the SI model to different speakers. Using a database designed for ASR for TTS purposes provided many different amateur voices, each with only a few minutes of recordings not made in studio conditions. This paper shows how speaker adaptation techniques provide the right tools to generate multiple voices with very little adaptation data. A subjective evaluation was carried out to assess the intelligibility and naturalness of the generated voices, as well as the similarity of the adapted voices to both the original speaker and the average voice from the SI model. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Reconeixement dels dígits catalans utilitzant models de Markov continus [Recognition of Catalan digits using continuous Markov models]

Author: Garrigosa Rivas Sara
Moreno Bilbao M. Asunción
Publication venue: Branca d'Estudiants de l'IEEE de Barcelona
Publication date: 01/01/1993

Peer Reviewed

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Revistes Catalanes amb Accés Obert

Reconocimiento de voz multidialectal España - Colombia [Multidialectal speech recognition Spain - Colombia]

Author: Caballero Galeote Mónica
Moreno Bilbao M. Asunción
Publication venue: Escola Tècnica Superior d'Enginyers de Telecomunicació de Barcelona
Publication date: 01/01/2001

Peer Reviewed

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

FIR system identification using a linear combination of cumulants

Author: Moreno Bilbao M. Asunción
Rodríguez Fonollosa José Adrián
Vidal Manzano José
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1992

A general linear approach to identifying the parameters of a moving average (MA) model from the statistics of the output is developed. It is shown that, under some constraints, the impulse response of the system can be expressed as a linear combination of cumulant slices. This result is then used to obtain a new well-conditioned linear method to estimate the MA parameters of a non-Gaussian process. The proposed approach does not require prior estimation of the filter order. Simulation results show improved performance with respect to existing methods. Peer Reviewed. Postprint (published version)
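
As background to the idea of recovering an impulse response from cumulant slices, the sketch below checks a well-known special case: for an MA(q) system with h(0) = 1 driven by i.i.d. non-Gaussian noise, the single third-order slice satisfies h(k) = c3(q, k) / c3(q, 0). This is not the method of the paper (which combines several slices and avoids knowing q in advance); it uses exact cumulants computed from an assumed impulse response, and the function names are hypothetical.

```python
import numpy as np

def third_order_cumulant(h, tau1, tau2, gamma3=1.0):
    """Theoretical third-order output cumulant of an MA system driven by
    i.i.d. noise with third-order cumulant gamma3:
    c3(tau1, tau2) = gamma3 * sum_n h[n] * h[n + tau1] * h[n + tau2]."""
    h = np.asarray(h, dtype=float)
    total = 0.0
    for n in range(len(h)):
        i, j = n + tau1, n + tau2
        # Terms that index outside the finite impulse response are zero.
        if 0 <= i < len(h) and 0 <= j < len(h):
            total += h[n] * h[i] * h[j]
    return gamma3 * total

def impulse_response_from_slice(h_true, gamma3=1.0):
    """Recover h(k), k = 0..q, from the slice c3(q, k) normalised by c3(q, 0).
    Assumes the order q is known, h[0] == 1 and h[q] != 0."""
    q = len(h_true) - 1
    c_q0 = third_order_cumulant(h_true, q, 0, gamma3)
    return np.array(
        [third_order_cumulant(h_true, q, k, gamma3) / c_q0 for k in range(q + 1)]
    )

# Hypothetical MA(2) impulse response used only for illustration.
h_example = [1.0, -0.5, 0.25]
recovered = impulse_response_from_slice(h_example)
```

In practice the cumulants would be estimated from output samples rather than computed from a known h, so the recovered coefficients would only be approximate.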

UPCommons. Portal del coneixement obert de la UPC

The strategic impact of META-NET on the regional, national and international level

Author: Ananiadou Sophia
Bel Nuria
Moreno Bilbao M. Asunción
Rehm Georg
Uszkoreit Hans
Publication date: 01/01/2014

This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact at the regional, national and international levels, mainly with regard to politics and the funding situation for LT topics. This paper documents the initiative’s work throughout Europe to boost progress and innovation in our field. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Monolingual and bilingual Spanish-Catalan speech recognizers developed from SpeechDat databases

Author: Mariño Acebal José Bernardo
Moreno Bilbao M. Asunción
Nadeu Camprubí Climent
Padrell J
Publication venue: C. Draxler
Publication date: 01/01/2000

Under the SpeechDat specifications, the Spanish member of the SpeechDat consortium has recorded a Catalan database that includes one thousand speakers. This communication describes some experimental work that has been carried out using both the Spanish and the Catalan speech material. A speech recognition system has been trained for the Spanish language using a selection of the phonetically balanced utterances from the 4500 SpeechDat training sessions. Utterances with mispronounced or incomplete words and with intermittent noise were discarded. A set of 26 allophones was selected to account for the Spanish sounds, and clustered demiphones were used as context-dependent sub-lexical units. Following the same methodology, a recognition system was trained from the Catalan SpeechDat database. Catalan sounds were described with 32 allophones. Additionally, a bilingual recognition system was built for both the Spanish and Catalan languages. Clustering techniques were used to determine a suitable set of allophones covering both languages simultaneously; 33 allophones were selected. The training material comprised the whole Catalan training material and the Spanish material coming from the Eastern region of Spain (the region where Catalan is spoken). The performance of the Spanish, Catalan and bilingual systems was assessed under the same framework. The Spanish system exhibits significantly better performance than the other systems due to its better training. The bilingual system provides performance equivalent to that afforded by the language-specific systems trained with the Eastern Spanish material or the Catalan SpeechDat corpus. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

New HOS-based parameter estimation methods for speech recognition in noisy environments

Author: Moreno Bilbao M. Asunción
Rodríguez Fonollosa José Adrián
Tortola S
Vidal Manzano José
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1995

The problem of recognition in noisy environments is addressed. Often, a recognition system is used in a noisy environment and there is no possibility of training it with noisy samples. Classical speech analysis techniques are based on second-order statistics, and their performance decreases dramatically when noise is present in the signal under analysis. New methods based on higher order statistics (HOS) are applied in a recognition system and compared against the autocorrelation method. Cumulant-based methods show better performance than autocorrelation-based methods at low SNR. Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC

Comparison of different order cumulants in a speech enhancement system by adaptive Wiener filtering

Author: Masgrau Gómez Enrique José
Moreno Bilbao M. Asunción
Salavedra Molí Josep
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/1993

The authors study some speech enhancement algorithms based on the iterative Wiener filtering method due to Lim and Oppenheim (1978), where the AR spectral estimation of the speech is carried out using a second-order analysis. In their algorithms, however, the authors perform the AR estimation by means of a cumulant (third- and fourth-order) analysis. The authors compare the behavior of the cumulant algorithms with that of the classical autocorrelation one. Results are presented for the noise that allows the best improvement (additive white Gaussian noise) and for the noises that lead to the worst one (diesel engine and reactor noise). An exhaustive empirical test shows that the cumulant algorithms outperform the original autocorrelation algorithm, especially at low SNR. Peer Reviewed. Postprint (published version)
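
To make the baseline concrete, the following is a minimal sketch of iterative Wiener filtering with a second-order (autocorrelation) AR estimate, i.e. the classical scheme that the abstract compares against. It is not the authors' implementation: it processes the whole signal at once rather than frame by frame, assumes additive white noise of known variance, and all function and parameter names are hypothetical. A cumulant-based variant would replace `ar_autocorrelation` with an AR estimator built on third- or fourth-order cumulants.

```python
import numpy as np

def ar_autocorrelation(x, order):
    """Estimate AR coefficients a[1..p] and excitation power g2 from the
    Yule-Walker equations, for the model x[n] + sum_k a[k] x[n-k] = e[n]."""
    n = len(x)
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R = R + (1e-9 * r[0] + 1e-12) * np.eye(order)  # light regularisation
    a = np.linalg.solve(R, -r[1:])
    g2 = r[0] + np.dot(a, r[1:])
    return a, max(g2, 1e-12)

def iterative_wiener(noisy, noise_var, order=10, iterations=3):
    """Alternate between AR estimation of the current speech estimate and
    Wiener filtering of the noisy signal with gain P_s / (P_s + P_n)."""
    noisy = np.asarray(noisy, dtype=float)
    nfft = len(noisy)
    Y = np.fft.rfft(noisy, nfft)
    estimate = noisy.copy()
    gain = np.ones(len(Y))
    for _ in range(iterations):
        a, g2 = ar_autocorrelation(estimate, order)
        # AR power spectrum: g2 / |A(e^{jw})|^2 with A(z) = 1 + sum a_k z^-k.
        A = np.fft.rfft(np.concatenate(([1.0], a)), nfft)
        speech_psd = g2 / (np.abs(A) ** 2 + 1e-12)
        gain = speech_psd / (speech_psd + noise_var)
        estimate = np.fft.irfft(gain * Y, nfft)[: len(noisy)]
    return estimate, gain

# Illustrative use on a synthetic signal (a sinusoid in white Gaussian noise).
rng = np.random.default_rng(0)
t = np.arange(512)
clean = np.sin(2 * np.pi * 0.05 * t)
noisy = clean + rng.normal(0.0, 0.5, size=t.size)
enhanced, wiener_gain = iterative_wiener(noisy, noise_var=0.25)
```

The gain is always between 0 and 1, so frequency bins where the AR model predicts little speech energy are attenuated most strongly.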

UPCommons. Portal del coneixement obert de la UPC

Some robust speech enhancement techniques using higher order AR estimation

Author: Masgrau Gómez Enrique José
Moreno Bilbao M. Asunción
Salavedra Molí Josep
Publication date: 01/01/1994

Peer Reviewed. Postprint (published version)

UPCommons. Portal del coneixement obert de la UPC